Introduction

Originally used to reduce storage and administration costs in mainframe and mid-sized systems, Hierarchical Storage Management (HSM) is now helping manage data on LANs. The theory behind HSM is the same for mainframes and LANs alike: primary on-line storage is costly if not managed efficiently. In LAN environments, the burden of storage management is heightened because it affects not only the network managers but nearly every user as well. Time spent dealing with the capacity of server and workstation volumes can add up to a loss far greater than the price of the disks themselves. By completely automating the management of disk capacity, HSM can deliver the benefits of near-infinite storage while minimizing cost, labor, and overhead.

This White Paper illustrates how HSM provides the best solution to the never-ending need to increase LAN storage. Refer to the glossary at the end of this White Paper for a description of the terms used herein.

HSM: The first effective solution to the LAN storage dilemma

Disk volumes on LAN servers always fill up - it's almost a law of physics. Until recently, there have been only three ways to deal with the problem when it happens:

Approach #1: Buy more disk.

This solution does solve the problem, but at a price. Disk storage on a server typically costs between $1,000 and $2,000 per gigabyte, and twice that for mirroring, making continuing disk expansion rather expensive. But the disk drive is only the beginning; you also need controllers, cables, power, and physical space. Furthermore, as the on-line capacity grows, more RAM is needed for the file server to service disk cache, more disk connections are used in both software and hardware, and more backup systems are needed to protect the additional hard drive volumes of data.

Approach #2: Have the users remove "clutter."
When buying more disks isn't a practical option, and the network manager doesn't have the time or appropriate tools to manage data, this is the only remaining option. While its apparent costs are zero (no disk or special software to purchase), this option is generally the least reliable and most expensive. Manual deletion of files is time-consuming and inherently risky. Users may inadvertently delete valuable files, or they may remove so few files as to make the exercise pointless. It may take days for an administrator to convince users to "clean up their act," only to find they have also removed applications, important data files, or application configuration files. While cooperation may have yielded some additional disk space, more time must be spent putting the system back in shape.

Approach #3: Migrate the files to off-line storage.

This approach involves making extra off-line copies of inactive files, and then removing them from the on-line storage. This does allow space to be reclaimed in a reasonable time period, but has the drawback that user access to the migrated data is inconvenient at best. In most cases the users must request specific files from the system administrator, then wait for the data to be found in the off-line archives and, hopefully, restored. Often, users will not know the specific name or location of migrated data, and an imprecise and often fruitless search of the off-line media must begin. This process may take hours or even days. And as the amount of migrated data grows, the recall requests can become unmanageable.

The HSM Alternative

HSM (Hierarchical Storage Management) is an automated process that provides the best solution to the never-ending demand for increases in LAN storage. By moving inactive data to less expensive secondary storage, and recalling it automatically when needed, HSM gives users the benefits of near-infinite disk storage without the cost.

Figure 1.
Storage Tiers

Compared to approach #1 (buying more disk), HSM provides the same 100% accessibility at a much lower cost. Compared to approach #3 (migrating old data), HSM provides the same cost savings without the administrative burden or inconvenience to the user. Compared to approach #2 (have users remove clutter), HSM allows users to go on doing productive work and handles the storage management tasks automatically and safely.

Most HSM systems perform the following three basic functions:

Pre-staging of inactive data to secondary storage. Some form of secondary storage is used. Typically, this takes the form of a tape autoloader, an optical jukebox, a single device such as a large tape drive, or some combination of these devices. A key characteristic of the secondary storage is a cost per gigabyte much lower than that of magnetic disk (primary storage). The HSM system is responsible for moving copies of inactive files into the secondary storage, in anticipation of the need to remove them from primary storage as the disk volumes fill. (Data is pre-staged to avoid the performance burden of transferring large amounts of data during peak usage hours.)

Monitoring of primary storage volumes with migration as needed. Throughout the day and night, the HSM system monitors the volumes it services. If the amount of data on a volume exceeds the configurable "high water mark" (usually expressed as a percentage of total disk space), migration will occur. If pre-staging has been used, this migration requires nothing more than selecting the right files to migrate, shortening each of them to a very small "phantom" or "stub" file as a placeholder, and stopping the migration when the specified "low water mark" is reached.

Automatic recall of files as needed. The users continue to see all data as on-line. When a file is accessed, a recall agent, resident in the client or in the server, initiates the process of moving the file from secondary storage back to primary storage.
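The monitoring and migration function described above can be illustrated with a short sketch. This is not the code of any particular HSM product; the names `FileEntry` and `migrate_if_needed`, the 90% and 75% water marks, and the choice to migrate the least recently used files first are all assumptions made for illustration. The sketch models a volume as a list of file records and shortens pre-staged files to zero-byte stubs until usage falls to the low water mark.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    """Hypothetical record for one file on a primary storage volume."""
    name: str
    size: int            # bytes currently occupied on primary storage
    last_access: float   # time of last access (larger means more recent)
    prestaged: bool      # True if a copy already exists on secondary storage
    stub: bool = False   # True once the file has been shortened to a phantom


def used_fraction(files, capacity):
    """Fraction of the volume's capacity currently in use."""
    return sum(f.size for f in files) / capacity


def migrate_if_needed(files, capacity, high_water=0.90, low_water=0.75):
    """Stub pre-staged files when usage exceeds the high water mark.

    Migration stops as soon as usage is at or below the low water mark.
    Returns the names of the files that were migrated, in order.
    """
    migrated = []
    if used_fraction(files, capacity) <= high_water:
        return migrated  # volume is not too full; nothing to do

    # Assumed selection policy: least recently used files go first.
    candidates = sorted(
        (f for f in files if f.prestaged and not f.stub),
        key=lambda f: f.last_access,
    )
    for f in candidates:
        if used_fraction(files, capacity) <= low_water:
            break
        # The data is already on secondary storage, so migration only
        # needs to shorten the file to a placeholder.
        f.size = 0
        f.stub = True
        migrated.append(f.name)
    return migrated


volume = [
    FileEntry("report.doc", 400, last_access=1.0, prestaged=True),
    FileEntry("budget.wk1", 300, last_access=2.0, prestaged=True),
    FileEntry("today.txt", 250, last_access=3.0, prestaged=False),
]
print(migrate_if_needed(volume, capacity=1000))
```

Because only pre-staged files are candidates, the loop never has to copy data; a file that has not yet been copied to secondary storage (`today.txt` here) is left untouched no matter how full the volume is.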
These three processes combined allow for rapid LAN storage growth without the expense of on-line disk or the administrative overhead of manual storage management. As the disks fill, inactive files are moved off to less expensive storage, but remain accessible to users without administrator intervention.

Why is HSM for LANs such a visible topic now?

Several independent trends have converged to make HSM attractive for LANs.

Size of LAN storage

LAN data and storage continue to grow exponentially. Even with the decreasing costs of magnetic disks, keeping up with the storage growth is an expensive proposition. Beyond the costs of the disks themselves, issues with server configuration (RAM, SCSI ports, slots) and the time and cost involved in management of larger storage systems are creating a greater need for an HSM solution.

Figure 2. Typical LAN Capacity Profile

Availability of robotic storage devices

The last two years have seen an increase in availability and a decrease in cost of high-capacity autoloaders and jukeboxes. These robotic devices have matured in terms of both dependability and programmability, making them more responsive to sophisticated software control. The ready availability of low-cost secondary storage is a prerequisite to HSM; the enhanced capabilities of these devices make them ideal storage management components.

Availability of HSM software

While file migration software (with manual restores) has been around for some time, truly automated HSM software was introduced to NetWare LANs only recently, in 1993. The automation of a true HSM system greatly improves usability of secondary storage.

HSM is not a new concept. HSM systems, both automated and semi-automated, have served mainframe and minicomputer platforms for years. As LANs grow in complexity and storage requirements to the level of mainframe systems, the need for HSM becomes more apparent.
The Palindrome HSM Software Approach: Integrating Backup, Archiving, and HSM

Palindrome HSM Software operates in conjunction with Palindrome's award-winning Network Archivist backup and archiving software. Palindrome HSM Software builds on the intelligent storage management architecture of Network Archivist to provide a flexible, scalable HSM solution applicable to a wide range of LAN environments. In addition, the tight integration of Palindrome HSM Software with Network Archivist provides a total storage management environment, capable of managing the entire backup, archiving, and HSM functions in one seamless, robust, and easy-to-administer package. This integration provides superior reliability and also allows surprisingly affordable implementation. Further, the modularity of the system approach allows LAN sites to customize and configure their storage management environment to fit their individual needs and to grow as LAN size increases.

Figure 3. Palindrome HSM Software Architecture

Palindrome HSM Software is composed of Palindrome HSM Volume Monitor and Palindrome Client Recall Agents.

Palindrome HSM Volume Monitor

Residing as an NLM on NetWare servers, the Palindrome Volume Monitor checks the disk capacity of the servers under its protection at configurable intervals to determine how full the storage devices have become. When any disk exceeds its high water mark - a level which the administrator has decided would be too full - the Palindrome system automatically converts files on the disk that are eligible for migration and already pre-staged onto secondary media into zero-byte phantom files. This migration of pre-staged data continues until the disk reaches its low water mark - a percentage of disk which is acceptably full and allows ample room for the server to continue running effectively.

The monitoring capability of Palindrome HSM Software is extremely powerful and flexible.
High and low water marks can be set individually for each server volume, permitting customized storage management according to a volume's capacity, storage requirements, and disk activity. Prior to migration, files are permanently archived to ensure that they can be restored. The HSM migration process leaves behind zero-byte phantom files in the process of removing eligible files from storage - the file's original file name remains on the server, but the file itself has been removed. This phantom file is key to recalling the data at a later time.

The order in which data is migrated can also be customized in a number of ways:

Least recently used. If the monitor is set for this option, the least recently used data is migrated first until the low water mark is reached. Files that have gone the longest without being accessed are less likely to be needed immediately and are good candidates for moving off primary storage. This option is the easiest to understand and to explain, and is the default option in Palindrome HSM. This option, however, may not be the best for all computing environments.

Largest first. If this option is chosen, all files that are eligible for migration are sorted by size and removed from primary storage in that order, largest first, until the low water mark is reached. In any environment, regardless of applications used, this option always migrates the fewest possible files to attain its migration goals and therefore reduces the number of future restore requests. Fewer migrations and fewer restores result in increased system performance.

Most eligible. Depending on how an administrator sets the rules for migrating files, Palindrome HSM software can determine which files are most eligible to be migrated. For example, files that have a "quick" migrate date - say a few weeks - would be more eligible than files with 52-week migration eligibility dates. The system can be set to prioritize migrations in just such a manner. Those files which are "more eligible" are the first removed (i.e.
their migration dates far exceed their migration rule).

Pre-staged Files and "Eligible for Migration"

It's important to note that one of Palindrome HSM's strengths comes from its tight integration with the Network Archivist software. With file protection rules set in the backup and archiving software, files are redundantly protected across more than one piece of secondary storage, ensuring that these files will remain protected even if one tape or optical disk is damaged, or a complete site disaster occurs. Once files are protected redundantly and reach the administrator's criteria for migration (not accessed for 12 weeks, for example), these eligible files can simply be converted to phantom files on the server, as they have already been pre-staged to secondary media through the Network Archivist system of archiving. This "instant" migration of eligible files from the server is an important advantage - no network traffic is generated moving files to secondary storage, and no separate migration operations need ever be performed.

Palindrome Client Recall Agents

Whether an attached workstation on the network is running DOS, Windows, or OS/2, the Palindrome Recall Agent is loaded into the machine's memory at start-up. These agents run in the background waiting for the client to access phantom files - zero-byte files left behind as place markers for migrated files (see Figure 3). Users can access these files even within applications, calling up a migrated text file, for example, from within a word processor. When the filename is accessed, the recall agent steps in. First, it notifies the user via a pop-up window that the file has been migrated and automatically queues the file for restore. It then submits the filename into the Palindrome queue, where it is referenced in the Palindrome File History Database for immediate retrieval.

User Notification and Control

No HSM system can be successful if it causes confusion or frustration for the users.
When a file recall is in progress, a pop-up informs the user, showing the file name, queue status, and an activity indicator. The user has three options: to do nothing and let the recall complete, to continue without waiting (useful in recalling groups of files), or to delete the recall request. If installed, Palindrome's File Manager can allow users to initiate the recall of a whole project, rather than requesting the files sequentially via the recall agent.

Scalable HSM

Because Palindrome delivers an integrated approach to a storage management environment, implementing HSM on a LAN can be done gradually, as storage requirements demand and budget allows. A business can purchase Network Archivist for backup and archiving to tape, customize the Archivist rules over a period of time to get the optimal protection for their environment, and then move into HSM in their own way.

Many LANs can deploy basic HSM functionality with a single tape drive or pair of drives, for example. Due to Network Archivist's flexible media handling capabilities, one tape drive can be designated to handle incremental backup data while another tape drive is set to handle near-line storage. The near-line tape can remain permanently in its drive to service demigration requests, adding 4-10 gigabytes of storage at low cost. As LAN primary disk and migrated data demands grow, the administrator can add an optical disc, a tape autoloader, an optical jukebox, or any combination - without re-implementation or lengthy configuration - to extend the reach and depth of the HSM services.

Conclusion

Palindrome is the LAN industry leader in automated storage management. Plans and accommodations for the current HSM products have been a part of Network Archivist's growth path since the shipment of Network Archivist in 1989. The thoughtful integration of HSM into the current product line is a natural evolution of the Palindrome storage management philosophy.
The power, flexibility, and economy of the Palindrome HSM system make it the most responsive to the local area network environment, now and in the future.

Glossary

On-line: hard disk drive (often referred to as primary storage).

Near-line: mechanically available data, usually stored on tapes or optical media (often referred to as secondary storage); in an HSM system, end-users have no need to know whether a file resides on primary or near-line storage.

Off-line: media held in a vault or on a shelf and not immediately available to the storage management software.

Pre-stage: to make copies of data eligible for migration onto secondary media. Once the data is protected redundantly, on multiple media, it can simply be removed from primary storage, as it has already been "moved" to secondary through pre-staging.

Migrate: to move data from one storage medium to another, usually lower in the hierarchy.

Phantom file: a filename, occupying zero bytes, which stands as a placeholder for data which has been migrated from primary storage.

Recall: opposite of migrate; to bring data back to primary storage.